Text-prompted speaker verification experiments with phoneme specific MLPs
نویسندگان
چکیده
The aims of the study described in this paper are (1) to assess the relative speaker discriminant properties of phonemes and (2) to investigate the importance of the temporal frame-to-frame information for speaker modelling in the framework of a text-prompted speaker verification system using Hidden Markov Models (HMMs) and Multi Layer Perceptrons (MLPs). It is shown that, with similar experimental conditions, nasals, fricatives and vowels convey more speaker specific informations than plosives and liquids. Regarding the influence of the frame-to-frame temporal information, significant improvements are reported from the inclusion of several acoustic frames at the input of the MLPs. Results tend also to show that each phoneme has its optimal MLP context size giving the best Equal Error Rate (EER).
منابع مشابه
A robust speaker verification system against imposture using an HMM-based speech synthesis system
This paper describes a text-prompted speaker verification system which is robust to imposture using synthetic speech generated by an HMM-based speech synthesis system. In the verification system, text and speaker are verified separately. Text verification is based on phoneme recognition using HMM, and speaker verification is based on GMM. To discriminate synthetic speech from natural speech, an...
متن کاملAn Alternative To Silence Removal For Text-Independent Speaker Verification
State-of-the-art text independent speaker verification systems use silence/speech detectors to get rid of silence frames which are considered to be non discriminative. This paper explores a possible replacement to this silence/speech detector by considering each Gaussian of a GMM as modeling a specific speech class and by using discriminant models like SVMs and MLPs in order to fuse the corresp...
متن کاملUsing phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech
Speaker segmentation is widely used in many tasks such as multi-speaker detection and speaker tracking. The segmentation performance depends on the performance of speaker verification (SV) between two short utterances to a large extent, so the improvement of the SV performance for short utterances would give the segmentation performance a great help. In this paper, a method based on phoneme rec...
متن کاملInvestigation of Frame Alignments for GMM-based Text-prompted Speaker Verification
The frame alignment acts as an important role in GMM-based speaker verification. In text-prompted speaker verification, it is common practice to use the transcriptions to align speech frames to phonetic units. In this paper, we compare the performance of alignments from hidden Markov model (HMM) and deep neural network (DNN), using the same training data and phonetic units. We incorporate a pho...
متن کاملA Chinese phoneme clustering theory and its application to a text independent speaker verification system
This paper presents a new idea of Chinese phoneme clustering and a text independent speaker verification system with this technique applied. It changes the way of conventional verification method with averaging features used, instead, both the dynamic and static features of speech are included in our new method. Also it leads to fast and efficient clustering algorithm in the training phase. The...
متن کامل